Describing Linde’s Dictionary of Polish for Digitalisation Purposes
Abstract
The present paper describes attempts at digitalising the so-called Linde’s dictionary of Polish, published in six volumes between 1807 and 1814 by Samuel Bogumił Linde. We are working on a formal description of the dictionary’s structure, intended to allow programmers to design a tool for automatic tagging of the text. The dictionary is multilingual, so performing good-quality OCR is a difficult task. The paper also describes the indexes that are going to be added: compiling an a tergo index and indexes of abbreviations, qualifiers and the names of quotation authors would improve the quality and usefulness of the digitalised version. Our work with the second edition of the dictionary (1854-1861) allows us to test several tools (in different stages of development) that are being developed within the framework of a Polish government grant directed by Janusz S. Bień.
Related resources
Multisłownik: Linking plWordNet-based Lexical Data for Lexicography and Educational Purposes
Multisłownik is an automated integrator of Polish lexical data retrieved from multiple available online sources intended to be used in various scenarios requiring access to such data, most prominently dictionary creation, linguistic studies and education. In contrast to many available internet dictionaries Multisłownik is WordNet-centric, capturing the core definitions from Słowosieć, the Polis...
Polish Word Sketches
Word sketches are one-page automatic, corpus-based summaries of a word's grammatical and collocational behaviour. They were first used in the production of the Macmillan English Dictionary (Rundell 2002). At that point, word sketches only existed for English. Today, the Sketch Engine is available, a corpus tool which takes as input a corpus of any language and corresponding grammar patterns and...
A New Dictionary Construction Method in Sparse Representation Techniques for Target Detection in Hyperspectral Imagery
Hyperspectral data in remote sensing, which have been gathered with efficient spectral resolution (about 10 nanometers), contain a plethora of spectral bands (roughly 200 bands). Since precious information about the spectral features of target materials can be extracted from these data, they have been used exclusively in hyperspectral target detection. One of the problems associated with the detect...
A Relational Model of Polish Inflection in Grammatical Dictionary of Polish
The subject of this article is a description of Polish inflection in the form of a relational database. The description has been developed for a grammatical dictionary of Polish that aims at complete inflectional characterisation of all Polish lexemes. We show some complexities of the Polish inflectional system for various grammatical classes. Then we present a relatively compact relational mod...
The on-line version of Grammatical Dictionary of Polish
We present the new online edition of a dictionary of Polish inflection – the Grammatical Dictionary of Polish (http://sgjp.pl). The dictionary is interesting for several reasons: it is comprehensive (over 330,000 lexemes corresponding to almost 4,300,000 different textual words; 1116 handcrafted inflectional patterns), the inflection is presented in an explicit manner in the form of carefully d...
Publication date: 2011